Overview

Dataset statistics

Number of variables: 17
Number of observations: 392735
Missing cells: 320070
Missing cells (%): 4.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 187.0 MiB
Average record size in memory: 499.2 B
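
The derived figures in this table can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 MiB is 1024 × 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 17
n_observations = 392_735
missing_cells = 320_070
total_memory_mib = 187.0

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The reported total memory is itself rounded, so the per-record
# figure only matches the report approximately.
avg_record_bytes = total_memory_mib * 1024 * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```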

Variable types

Categorical: 4
Numeric: 10
Boolean: 3

Alerts

OriginalCreditor_Redacted has a high cardinality: 52 distinct values [High cardinality]
CurrentBalance is highly overall correlated with DebtLoadPrincipal and 1 other field [High correlation]
DebtLoadPrincipal is highly overall correlated with CurrentBalance and 1 other field [High correlation]
BalanceAtDebtLoad is highly overall correlated with CurrentBalance and 1 other field [High correlation]
PurchasePrice is highly overall correlated with FrequencyEncodedCreditor and 1 other field [High correlation]
NumPhones is highly overall correlated with IsStatBarred [High correlation]
FrequencyEncodedCreditor is highly overall correlated with PurchasePrice and 2 other fields [High correlation]
OriginalCreditor_Redacted is highly overall correlated with PurchasePrice and 2 other fields [High correlation]
ProductOrDebtType is highly overall correlated with FrequencyEncodedCreditor and 1 other field [High correlation]
CollectionStatus is highly overall correlated with IsStatBarred and 1 other field [High correlation]
IsStatBarred is highly overall correlated with NumPhones and 1 other field [High correlation]
IsLegal is highly overall correlated with CollectionStatus [High correlation]
InBankruptcy is highly imbalanced (85.4%) [Imbalance]
IsLegal is highly imbalanced (83.3%) [Imbalance]
NumLiableParties is highly imbalanced (93.6%) [Imbalance]
LastPaymentAmount has 290802 (74.0%) missing values [Missing]
CustomerAge has 26490 (6.7%) missing values [Missing]
CurrentBalance is highly skewed (γ1 = 26.66167707) [Skewed]
DebtLoadPrincipal is highly skewed (γ1 = 37.04147046) [Skewed]
BalanceAtDebtLoad is highly skewed (γ1 = 34.7372639) [Skewed]
CurrentBalance has 68045 (17.3%) zeros [Zeros]
NumPhones has 266389 (67.8%) zeros [Zeros]
NumEmails has 312660 (79.6%) zeros [Zeros]
NumAddresses has 76185 (19.4%) zeros [Zeros]

Reproduction

Analysis started: 2023-10-17 05:40:39.698113
Analysis finished: 2023-10-17 05:41:15.050069
Duration: 35.35 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
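
A report of this kind could be regenerated along the following lines. This is only a sketch: the CSV path, the output file name and the report title are hypothetical, and the configuration actually used for this run is not reproduced here. The import path shown is the one used by pandas-profiling 3.x; later releases of the package are distributed as ydata-profiling.

```python
def build_report(csv_path, output_html="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so that defining this function does not require
    # pandas or pandas-profiling to be installed.
    import pandas as pd
    from pandas_profiling import ProfileReport  # pandas-profiling 3.x

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")  # title is illustrative
    profile.to_file(output_html)
    return output_html
```

Calling `build_report("accounts.csv")` would write `report.html` next to the script, assuming a file of that (hypothetical) name exists.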

Variables

OriginalCreditor_Redacted
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.4 MiB

Value              Count
Creditor 17        82764
Creditor 47        54845
Creditor 33        52340
Creditor 48        24854
Creditor 19        20454
Other values (47)  157478

Length

Max length: 11
Median length: 11
Mean length: 10.939399
Min length: 10

Characters and Unicode

Total characters: 4296285
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: Creditor 1
2nd row: Creditor 2
3rd row: Creditor 1
4th row: Creditor 2
5th row: Creditor 1

Common Values

Value              Count   Frequency (%)
Creditor 17        82764   21.1%
Creditor 47        54845   14.0%
Creditor 33        52340   13.3%
Creditor 48        24854   6.3%
Creditor 19        20454   5.2%
Creditor 10        18922   4.8%
Creditor 44        17242   4.4%
Creditor 25        16443   4.2%
Creditor 42        15933   4.1%
Creditor 35        14474   3.7%
Other values (42)  74464   19.0%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
creditor           392735  50.0%
17                 82764   10.5%
47                 54845   7.0%
33                 52340   6.7%
48                 24854   3.2%
19                 20454   2.6%
10                 18922   2.4%
44                 17242   2.2%
25                 16443   2.1%
42                 15933   2.0%
Other values (43)  88938   11.3%

Most occurring characters

Value             Count    Frequency (%)
r                 785470   18.3%
C                 392735   9.1%
e                 392735   9.1%
d                 392735   9.1%
i                 392735   9.1%
t                 392735   9.1%
o                 392735   9.1%
(space)           392735   9.1%
4                 151858   3.5%
7                 151524   3.5%
Other values (8)  458288   10.7%

Most occurring categories

Value             Count    Frequency (%)
Lowercase Letter  2749145  64.0%
Decimal Number    761670   17.7%
Uppercase Letter  392735   9.1%
Space Separator   392735   9.1%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
4      151858  19.9%
7      151524  19.9%
3      137973  18.1%
1      128565  16.9%
5      48663   6.4%
2      38353   5.0%
0      36058   4.7%
8      33075   4.3%
9      29700   3.9%
6      5901    0.8%
Lowercase Letter
Value  Count   Frequency (%)
r      785470  28.6%
e      392735  14.3%
d      392735  14.3%
i      392735  14.3%
t      392735  14.3%
o      392735  14.3%
Uppercase Letter
Value  Count   Frequency (%)
C      392735  100.0%
Space Separator
Value    Count   Frequency (%)
(space)  392735  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   3141880  73.1%
Common  1154405  26.9%

Most frequent character per script

Common
Value    Count   Frequency (%)
(space)  392735  34.0%
4        151858  13.2%
7        151524  13.1%
3        137973  12.0%
1        128565  11.1%
5        48663   4.2%
2        38353   3.3%
0        36058   3.1%
8        33075   2.9%
9        29700   2.6%
Latin
Value  Count   Frequency (%)
r      785470  25.0%
C      392735  12.5%
e      392735  12.5%
d      392735  12.5%
i      392735  12.5%
t      392735  12.5%
o      392735  12.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  4296285  100.0%

Most frequent character per block

ASCII
Value             Count   Frequency (%)
r                 785470  18.3%
C                 392735  9.1%
e                 392735  9.1%
d                 392735  9.1%
i                 392735  9.1%
t                 392735  9.1%
o                 392735  9.1%
(space)           392735  9.1%
4                 151858  3.5%
7                 151524  3.5%
Other values (8)  458288  10.7%

CurrentBalance
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 177653
Distinct (%): 45.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1342.9503
Minimum: -7717.2
Maximum: 441681.52
Zeros: 68045
Zeros (%): 17.3%
Negative: 2717
Negative (%): 0.7%
Memory size: 6.0 MiB

Quantile statistics

Minimum: -7717.2
5-th percentile: 0
Q1: 106.665
median: 486.89
Q3: 1204.77
95-th percentile: 5387.97
Maximum: 441681.52
Range: 449398.72
Interquartile range (IQR): 1098.105
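
The range and IQR follow directly from the other quantile figures. As a short worked check (variable names are illustrative), the sketch below recomputes both and also derives Tukey's 1.5 × IQR fences. The fences are a common outlier convention that the report itself does not use; they are shown only to illustrate how long the upper tail is.

```python
# Quantile figures copied from the CurrentBalance table.
minimum, q1, q3, maximum = -7717.2, 106.665, 1204.77, 441681.52

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

# Tukey's fences (an assumption of this sketch, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"range = {value_range:.2f}")
print(f"IQR = {iqr:.3f}")
print(f"fences = [{lower_fence:.2f}, {upper_fence:.2f}]")
```

The upper fence falls below the reported 95-th percentile (5387.97), so under this rule more than 5% of balances would be flagged as high outliers.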

Descriptive statistics

Standard deviation: 4093.788
Coefficient of variation (CV): 3.048354
Kurtosis: 1578.4783
Mean: 1342.9503
Median Absolute Deviation (MAD): 464.02
Skewness: 26.661677
Sum: 5.274236 × 10^8
Variance: 16759100
Monotonicity: Not monotonic
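
To make the skewness and CV figures concrete, here is a minimal sketch. It assumes the report's skewness is the bias-corrected sample skewness (the adjusted Fisher-Pearson coefficient) and that CV is the standard deviation divided by the mean. The toy balances are hypothetical and only illustrate how a single very large account drives skewness, and how a log transform reduces it.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (bias-corrected); needs n >= 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

def coefficient_of_variation(std, mean):
    return std / mean

# Hypothetical toy balances with one very large account (not report data).
balances = [0.0, 50.0, 120.0, 480.0, 1200.0, 5400.0, 440000.0]
logged = [math.log1p(b) for b in balances]   # log(1 + x) keeps zeros valid

print(sample_skewness(balances))
print(sample_skewness(logged))
print(coefficient_of_variation(4093.788, 1342.9503))
```

Note that `log1p` is undefined for values at or below -1, so the negative balances in this column would need separate handling before such a transform.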
Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
0                      68045   17.3%
50                     172     < 0.1%
30                     156     < 0.1%
65.63                  134     < 0.1%
40.68                  130     < 0.1%
60                     130     < 0.1%
32.35                  124     < 0.1%
20                     123     < 0.1%
45.29                  122     < 0.1%
59.38                  118     < 0.1%
Other values (177643)  323481  82.4%

Minimum 10 values

Value     Count  Frequency (%)
-7717.2   1      < 0.1%
-7436.41  1      < 0.1%
-6613.66  1      < 0.1%
-5230.35  1      < 0.1%
-4928.52  1      < 0.1%
-4920.03  1      < 0.1%
-4701.68  1      < 0.1%
-4655.43  1      < 0.1%
-4479.42  1      < 0.1%
-4388.03  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
441681.52  1      < 0.1%
425064.21  1      < 0.1%
384946.87  1      < 0.1%
315137.08  1      < 0.1%
293571.08  1      < 0.1%
286287.29  1      < 0.1%
285475.44  1      < 0.1%
261650.56  1      < 0.1%
239452.19  1      < 0.1%
226832.93  1      < 0.1%

DebtLoadPrincipal
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 195921
Distinct (%): 49.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1588.3375
Minimum: 0
Maximum: 844343
Zeros: 152
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 51
Q1: 272.86
median: 647.84
Q3: 1444.54
95-th percentile: 5945.136
Maximum: 844343
Range: 844343
Interquartile range (IQR): 1171.68

Descriptive statistics

Standard deviation: 4484.2173
Coefficient of variation (CV): 2.8232145
Kurtosis: 4158.01
Mean: 1588.3375
Median Absolute Deviation (MAD): 457.71
Skewness: 37.04147
Sum: 6.2379572 × 10^8
Variance: 20108205
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
32                     1103    0.3%
25                     1100    0.3%
47                     725     0.2%
45                     631     0.2%
20                     605     0.2%
30                     557     0.1%
27                     524     0.1%
40                     433     0.1%
36                     431     0.1%
60                     410     0.1%
Other values (195911)  386216  98.3%

Minimum 10 values

Value  Count  Frequency (%)
0      152    < 0.1%
0.01   10     < 0.1%
0.02   2      < 0.1%
0.03   1      < 0.1%
0.07   1      < 0.1%
0.08   1      < 0.1%
0.09   1      < 0.1%
0.11   2      < 0.1%
0.16   1      < 0.1%
0.21   2      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
844343     1      < 0.1%
425064.21  1      < 0.1%
384946.87  1      < 0.1%
383878.71  1      < 0.1%
293571.08  1      < 0.1%
286287.29  1      < 0.1%
285475.44  1      < 0.1%
274006.16  1      < 0.1%
261650.56  1      < 0.1%
245937.19  1      < 0.1%

BalanceAtDebtLoad
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 198413
Distinct (%): 50.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1652.2256
Minimum: 0
Maximum: 844343
Zeros: 122
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 56.16
Q1: 276.225
median: 661.02
Q3: 1487.53
95-th percentile: 6227.55
Maximum: 844343
Range: 844343
Interquartile range (IQR): 1211.305

Descriptive statistics

Standard deviation: 4601.4435
Coefficient of variation (CV): 2.7849971
Kurtosis: 3756.8049
Mean: 1652.2256
Median Absolute Deviation (MAD): 468.52
Skewness: 34.737264
Sum: 6.4888683 × 10^8
Variance: 21173282
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
40.68                  394     0.1%
45.29                  391     0.1%
32.35                  358     0.1%
30                     357     0.1%
39.3                   331     0.1%
62.94                  330     0.1%
65.63                  329     0.1%
50                     303     0.1%
59.38                  291     0.1%
90.63                  287     0.1%
Other values (198403)  389364  99.1%

Minimum 10 values

Value  Count  Frequency (%)
0      122    < 0.1%
0.01   10     < 0.1%
0.02   2      < 0.1%
0.03   1      < 0.1%
0.07   1      < 0.1%
0.08   1      < 0.1%
0.09   1      < 0.1%
0.11   2      < 0.1%
0.16   1      < 0.1%
0.21   2      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
844343     1      < 0.1%
425064.21  1      < 0.1%
384946.87  1      < 0.1%
383878.71  1      < 0.1%
293571.08  1      < 0.1%
286287.29  1      < 0.1%
285475.44  1      < 0.1%
274006.16  1      < 0.1%
261650.56  1      < 0.1%
245937.19  1      < 0.1%

PurchasePrice
Real number (ℝ)

Distinct: 48
Distinct (%): < 0.1%
Missing: 2656
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.6738133
Minimum: 0.19
Maximum: 52.18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0.19
5-th percentile: 2.32
Q1: 3.07
median: 4.22
Q3: 6.59
95-th percentile: 11.7
Maximum: 52.18
Range: 51.99
Interquartile range (IQR): 3.52

Descriptive statistics

Standard deviation: 5.5238955
Coefficient of variation (CV): 0.97357725
Kurtosis: 46.780741
Mean: 5.6738133
Median Absolute Deviation (MAD): 1.15
Skewness: 6.0923307
Sum: 2213235.4
Variance: 30.513421
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)

Common values

Value              Count  Frequency (%)
3.07               82760  21.1%
2.32               39859  10.1%
3.6                35115  8.9%
7.38               34213  8.7%
5.72               24854  6.3%
4.22               24035  6.1%
4.96               18865  4.8%
6.59               17224  4.4%
4.31               16575  4.2%
11.7               16435  4.2%
Other values (38)  80144  20.4%

Minimum 10 values

Value  Count  Frequency (%)
0.19   853    0.2%
0.65   310    0.1%
1.44   1325   0.3%
1.77   355    0.1%
1.84   2762   0.7%
2.32   39859  10.1%
2.35   106    < 0.1%
3.07   82760  21.1%
3.6    35115  8.9%
3.87   8449   2.2%

Maximum 10 values

Value  Count  Frequency (%)
52.18  3782   1.0%
32.27  246    0.1%
25.41  1158   0.3%
16.7   434    0.1%
15.31  3521   0.9%
15     1      < 0.1%
14.83  1542   0.4%
12     3      < 0.1%
11.73  1      < 0.1%
11.71  2      < 0.1%

ProductOrDebtType
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.1 MiB

Value                    Count
Utilities/Telco - Other  207110
Other                    75586
Finance Company - Other  48693
Store Cards              17699
Credit Cards             16885
Other values (5)         26762

Length

Max length: 23
Median length: 23
Mean length: 17.988575
Min length: 5

Characters and Unicode

Total characters: 7064743
Distinct characters: 32
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Other
2nd row: Other
3rd row: Other
4th row: Other
5th row: Other

Common Values

Value                    Count   Frequency (%)
Utilities/Telco - Other  207110  52.7%
Other                    75586   19.2%
Finance Company - Other  48693   12.4%
Store Cards              17699   4.5%
Credit Cards             16885   4.3%
Bank - Other             13030   3.3%
Residential Electricity  7693    2.0%
Personal Loans           4309    1.1%
Loans                    1260    0.3%
Hire Purchase            470     0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value             Count   Frequency (%)
other             344419  33.6%
-                 268833  26.2%
utilities/telco   207110  20.2%
finance           48693   4.7%
company           48693   4.7%
cards             34584   3.4%
store             17699   1.7%
credit            16885   1.6%
bank              13030   1.3%
residential       7693    0.7%
Other values (5)  18511   1.8%

Most occurring characters

Value              Count    Frequency (%)
e                  870244   12.3%
t                  816302   11.6%
i                  718150   10.2%
(space)            633415   9.0%
l                  433915   6.1%
r                  426529   6.0%
h                  344889   4.9%
O                  344419   4.9%
o                  283380   4.0%
c                  271659   3.8%
Other values (22)  1921841  27.2%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   4990958  70.6%
Uppercase Letter   964427   13.7%
Space Separator    633415   9.0%
Dash Punctuation   268833   3.8%
Other Punctuation  207110   2.9%

Most frequent character per category

Lowercase Letter
Value             Count   Frequency (%)
e                 870244  17.4%
t                 816302  16.4%
i                 718150  14.4%
l                 433915  8.7%
r                 426529  8.5%
h                 344889  6.9%
o                 283380  5.7%
c                 271659  5.4%
s                 259735  5.2%
n                 176680  3.5%
Other values (7)  389475  7.8%
Uppercase Letter
Value             Count   Frequency (%)
O                 344419  35.7%
U                 207110  21.5%
T                 207110  21.5%
C                 100162  10.4%
F                 48693   5.0%
S                 17699   1.8%
B                 13030   1.4%
R                 7693    0.8%
E                 7693    0.8%
L                 5569    0.6%
Other values (2)  5249    0.5%
Space Separator
Value    Count   Frequency (%)
(space)  633415  100.0%
Dash Punctuation
Value  Count   Frequency (%)
-      268833  100.0%
Other Punctuation
Value  Count   Frequency (%)
/      207110  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   5955385  84.3%
Common  1109358  15.7%

Most frequent character per script

Latin
Value              Count    Frequency (%)
e                  870244   14.6%
t                  816302   13.7%
i                  718150   12.1%
l                  433915   7.3%
r                  426529   7.2%
h                  344889   5.8%
O                  344419   5.8%
o                  283380   4.8%
c                  271659   4.6%
s                  259735   4.4%
Other values (19)  1186163  19.9%
Common
Value    Count   Frequency (%)
(space)  633415  57.1%
-        268833  24.2%
/        207110  18.7%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  7064743  100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
e                  870244   12.3%
t                  816302   11.6%
i                  718150   10.2%
(space)            633415   9.0%
l                  433915   6.1%
r                  426529   6.0%
h                  344889   4.9%
O                  344419   4.9%
o                  283380   4.0%
c                  271659   3.8%
Other values (22)  1921841  27.2%

CollectionStatus
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 27.3 MiB

Value                Count
ACTIVE               167445
PASSIVE              128651
PAID_IN_FULL         67043
CLOSED               13524
CANCELLED_WITHDRAWN  5401
Other values (7)     10671

Length

Max length: 19
Median length: 17
Mean length: 7.7671789
Min length: 5

Characters and Unicode

Total characters: 3050443
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PAID_IN_FULL
2nd row: CANCELLED_WITHDRAWN
3rd row: PAID_IN_FULL
4th row: PASSIVE
5th row: PAID_IN_FULL

Common Values

Value                Count   Frequency (%)
ACTIVE               167445  42.6%
PASSIVE              128651  32.8%
PAID_IN_FULL         67043   17.1%
CLOSED               13524   3.4%
CANCELLED_WITHDRAWN  5401    1.4%
UNDER_ARRANGEMENT    4237    1.1%
SETTLED FOR LESS     4191    1.1%
LEGAL                1559    0.4%
LEGAL_ARRANGEMENT    361     0.1%
NON_COLLECTION       237     0.1%
Other values (2)     86      < 0.1%

Length

Histogram of lengths of the category

Value                Count   Frequency (%)
active               167445  41.7%
passive              128651  32.1%
paid_in_full         67043   16.7%
closed               13524   3.4%
cancelled_withdrawn  5401    1.3%
under_arrangement    4237    1.1%
settled              4191    1.0%
for                  4191    1.0%
less                 4191    1.0%
legal                1559    0.4%
Other values (4)     684     0.2%

Most occurring characters

Value              Count   Frequency (%)
I                  435906  14.3%
A                  385057  12.6%
E                  348612  11.4%
V                  296096  9.7%
S                  283399  9.3%
P                  195721  6.4%
C                  192245  6.3%
T                  186063  6.1%
L                  171167  5.6%
_                  144322  4.7%
Other values (11)  411855  13.5%

Most occurring categories

Value                  Count    Frequency (%)
Uppercase Letter       2897739  95.0%
Connector Punctuation  144322   4.7%
Space Separator        8382     0.3%

Most frequent character per category

Uppercase Letter
Value             Count   Frequency (%)
I                 435906  15.0%
A                 385057  13.3%
E                 348612  12.0%
V                 296096  10.2%
S                 283399  9.8%
P                 195721  6.8%
C                 192245  6.6%
T                 186063  6.4%
L                 171167  5.9%
D                 99883   3.4%
Other values (9)  303590  10.5%
Connector Punctuation
Value  Count   Frequency (%)
_      144322  100.0%
Space Separator
Value    Count  Frequency (%)
(space)  8382   100.0%

Most occurring scripts

Value   Count    Frequency (%)
Latin   2897739  95.0%
Common  152704   5.0%

Most frequent character per script

Latin
Value             Count   Frequency (%)
I                 435906  15.0%
A                 385057  13.3%
E                 348612  12.0%
V                 296096  10.2%
S                 283399  9.8%
P                 195721  6.8%
C                 192245  6.6%
T                 186063  6.4%
L                 171167  5.9%
D                 99883   3.4%
Other values (9)  303590  10.5%
Common
Value    Count   Frequency (%)
_        144322  94.5%
(space)  8382    5.5%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  3050443  100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
I                  435906  14.3%
A                  385057  12.6%
E                  348612  11.4%
V                  296096  9.7%
S                  283399  9.3%
P                  195721  6.4%
C                  192245  6.3%
T                  186063  6.1%
L                  171167  5.6%
_                  144322  4.7%
Other values (11)  411855  13.5%

IsStatBarred
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 MiB

Value  Count   Frequency (%)
True   273082  69.5%
False  119653  30.5%

InBankruptcy
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 MiB

Value  Count   Frequency (%)
False  384586  97.9%
True   8149    2.1%

IsLegal
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 MiB

Value  Count   Frequency (%)
False  383021  97.5%
True   9714    2.5%
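
The imbalance percentages in the alerts appear consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how that score is computed, not a copy of the profiling library's code. Applied to the IsLegal counts it gives about 83.3%, and applied to the 384586 / 8149 Boolean split listed just before IsLegal it gives about 85.4%, matching the figure the alerts report for InBankruptcy.

```python
import math

def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 = perfectly balanced, 1 = single class."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(counts))

is_legal = imbalance_score([383021, 9714])        # False, True
other_boolean = imbalance_score([384586, 8149])   # False, True

print(f"{is_legal:.1%} {other_boolean:.1%}")
```

A perfectly balanced column, such as `imbalance_score([50, 50])`, scores 0.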

LastPaymentAmount
Real number (ℝ)

Distinct: 25874
Distinct (%): 25.4%
Missing: 290802
Missing (%): 74.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 288.86777
Minimum: 0.01
Maximum: 73131.84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0.01
5-th percentile: 5.386
Q1: 20
median: 40
Q3: 148.68
95-th percentile: 1199.814
Maximum: 73131.84
Range: 73131.83
Interquartile range (IQR): 128.68

Descriptive statistics

Standard deviation: 1130.4537
Coefficient of variation (CV): 3.9133951
Kurtosis: 459.8578
Mean: 288.86777
Median Absolute Deviation (MAD): 30
Skewness: 15.40208
Sum: 29445158
Variance: 1277925.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
20                    9750    2.5%
10                    7538    1.9%
50                    4545    1.2%
30                    3612    0.9%
25                    3067    0.8%
40                    2949    0.8%
5                     2611    0.7%
100                   2128    0.5%
15                    2084    0.5%
60                    1123    0.3%
Other values (25864)  62526   15.9%
(Missing)             290802  74.0%

Minimum 10 values

Value  Count  Frequency (%)
0.01   24     < 0.1%
0.02   9      < 0.1%
0.03   5      < 0.1%
0.04   8      < 0.1%
0.05   8      < 0.1%
0.06   4      < 0.1%
0.07   1      < 0.1%
0.08   3      < 0.1%
0.09   3      < 0.1%
0.1    11     < 0.1%

Maximum 10 values

Value     Count  Frequency (%)
73131.84  1      < 0.1%
48521.79  1      < 0.1%
45105.92  1      < 0.1%
45000     1      < 0.1%
44000     1      < 0.1%
43500     1      < 0.1%
40000     1      < 0.1%
36100     1      < 0.1%
36000     1      < 0.1%
35521.72  1      < 0.1%

NumLiableParties
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 122
Missing (%): < 0.1%
Memory size: 25.5 MiB

Value  Count
1.0    385808
2.0    6650
3.0    151
4.0    4

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1177839
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value      Count   Frequency (%)
1.0        385808  98.2%
2.0        6650    1.7%
3.0        151     < 0.1%
4.0        4       < 0.1%
(Missing)  122     < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count   Frequency (%)
1.0    385808  98.3%
2.0    6650    1.7%
3.0    151     < 0.1%
4.0    4       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
.      392613  33.3%
0      392613  33.3%
1      385808  32.8%
2      6650    0.6%
3      151     < 0.1%
4      4       < 0.1%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     785226  66.7%
Other Punctuation  392613  33.3%

Most frequent character per category

Decimal Number
Value  Count   Frequency (%)
0      392613  50.0%
1      385808  49.1%
2      6650    0.8%
3      151     < 0.1%
4      4       < 0.1%
Other Punctuation
Value  Count   Frequency (%)
.      392613  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  1177839  100.0%

Most frequent character per script

Common
Value  Count   Frequency (%)
.      392613  33.3%
0      392613  33.3%
1      385808  32.8%
2      6650    0.6%
3      151     < 0.1%
4      4       < 0.1%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  1177839  100.0%

Most frequent character per block

ASCII
Value  Count   Frequency (%)
.      392613  33.3%
0      392613  33.3%
1      385808  32.8%
2      6650    0.6%
3      151     < 0.1%
4      4       < 0.1%

CustomerAge
Real number (ℝ)

Distinct: 126
Distinct (%): < 0.1%
Missing: 26490
Missing (%): 6.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.938066
Minimum: -41
Maximum: 133
Zeros: 0
Zeros (%): 0.0%
Negative: 20
Negative (%): < 0.1%
Memory size: 6.0 MiB

Quantile statistics

Minimum: -41
5-th percentile: 29
Q1: 36
median: 44
Q3: 54
95-th percentile: 70
Maximum: 133
Range: 174
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 12.862101
Coefficient of variation (CV): 0.27998787
Kurtosis: 0.65817338
Mean: 45.938066
Median Absolute Deviation (MAD): 9
Skewness: 0.78275274
Sum: 16824587
Variance: 165.43365
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
38                  13936   3.5%
37                  13488   3.4%
39                  13027   3.3%
36                  12532   3.2%
40                  12353   3.1%
41                  12015   3.1%
35                  11286   2.9%
42                  11282   2.9%
34                  10880   2.8%
43                  10654   2.7%
Other values (116)  244792  62.3%
(Missing)           26490   6.7%

Minimum 10 values

Value  Count  Frequency (%)
-41    1      < 0.1%
-31    3      < 0.1%
-30    1      < 0.1%
-29    1      < 0.1%
-28    5      < 0.1%
-22    1      < 0.1%
-20    2      < 0.1%
-6     1      < 0.1%
-4     1      < 0.1%
-3     4      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
133    2      < 0.1%
132    1      < 0.1%
121    4      < 0.1%
120    12     < 0.1%
119    1      < 0.1%
115    1      < 0.1%
114    2      < 0.1%
113    3      < 0.1%
112    2      < 0.1%
111    10     < 0.1%
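
The extreme values above include negative ages and ages above 120, which are not plausible. One possible way to handle them before modelling is to treat out-of-range ages as missing. The bounds used below (18 and 110) and the function name are assumptions made for this sketch; the report does not prescribe any cleaning rule.

```python
def clean_age(age, low=18, high=110):
    """Return `age` if it lies in [low, high]; otherwise None (treated as missing)."""
    if age is None or not (low <= age <= high):
        return None
    return age

# A few values taken from the tables above, plus one missing entry.
sample_ages = [38, -41, 44, None, 133, 120, 29]
cleaned = [clean_age(a) for a in sample_ages]
print(cleaned)
```

Flagging implausible ages as missing keeps the rows available for the other variables while preventing the invalid values from distorting the mean and range.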

NumPhones
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.43395929
Minimum: 0
Maximum: 8
Zeros: 266389
Zeros (%): 67.8%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.71857973
Coefficient of variation (CV): 1.655869
Kurtosis: 3.190096
Mean: 0.43395929
Median Absolute Deviation (MAD): 0
Skewness: 1.759986
Sum: 170431
Variance: 0.51635683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Common values

Value  Count   Frequency (%)
0      266389  67.8%
1      90147   23.0%
2      29282   7.5%
3      6131    1.6%
4      633     0.2%
5      128     < 0.1%
6      22      < 0.1%
8      2       < 0.1%
7      1       < 0.1%

Minimum 10 values

Value  Count   Frequency (%)
0      266389  67.8%
1      90147   23.0%
2      29282   7.5%
3      6131    1.6%
4      633     0.2%
5      128     < 0.1%
6      22      < 0.1%
7      1       < 0.1%
8      2       < 0.1%

Maximum 10 values

Value  Count   Frequency (%)
8      2       < 0.1%
7      1       < 0.1%
6      22      < 0.1%
5      128     < 0.1%
4      633     0.2%
3      6131    1.6%
2      29282   7.5%
1      90147   23.0%
0      266389  67.8%

NumEmails
Real number (ℝ)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.21530039
Minimum: 0
Maximum: 5
Zeros: 312660
Zeros (%): 79.6%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.44001087
Coefficient of variation (CV): 2.0437068
Kurtosis: 3.2364063
Mean: 0.21530039
Median Absolute Deviation (MAD): 0
Skewness: 1.8833761
Sum: 84556
Variance: 0.19360957
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values

Value  Count   Frequency (%)
0      312660  79.6%
1      75917   19.3%
2      3870    1.0%
3      257     0.1%
4      27      < 0.1%
5      4       < 0.1%

Minimum 10 values

Value  Count   Frequency (%)
0      312660  79.6%
1      75917   19.3%
2      3870    1.0%
3      257     0.1%
4      27      < 0.1%
5      4       < 0.1%

Maximum 10 values

Value  Count   Frequency (%)
5      4       < 0.1%
4      27      < 0.1%
3      257     0.1%
2      3870    1.0%
1      75917   19.3%
0      312660  79.6%

NumAddresses
Real number (ℝ)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.84391765
Minimum: 0
Maximum: 7
Zeros: 76185
Zeros (%): 19.4%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 1
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.46529122
Coefficient of variation (CV): 0.55134671
Kurtosis: 3.0576544
Mean: 0.84391765
Median Absolute Deviation (MAD): 0
Skewness: -0.19020604
Sum: 331436
Variance: 0.21649592
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common values

Value  Count   Frequency (%)
1      303168  77.2%
0      76185   19.4%
2      12086   3.1%
3      1127    0.3%
4      139     < 0.1%
5      22      < 0.1%
6      7       < 0.1%
7      1       < 0.1%

Minimum 10 values

Value  Count   Frequency (%)
0      76185   19.4%
1      303168  77.2%
2      12086   3.1%
3      1127    0.3%
4      139     < 0.1%
5      22      < 0.1%
6      7       < 0.1%
7      1       < 0.1%

Maximum 10 values

Value  Count   Frequency (%)
7      1       < 0.1%
6      7       < 0.1%
5      22      < 0.1%
4      139     < 0.1%
3      1127    0.3%
2      12086   3.1%
1      303168  77.2%
0      76185   19.4%

FrequencyEncodedCreditor
Real number (ℝ)

Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.10031669
Minimum: 2.4604907 × 10^-6
Maximum: 0.20857087
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 MiB

Quantile statistics

Minimum: 2.4604907 × 10^-6
5-th percentile: 0.0058239814
Q1: 0.040999156
median: 0.061172719
Q3: 0.13612911
95-th percentile: 0.20857087
Maximum: 0.20857087
Range: 0.20856841
Interquartile range (IQR): 0.095129951

Descriptive statistics

Standard deviation: 0.070767034
Coefficient of variation (CV): 0.70543626
Kurtosis: -1.3325713
Mean: 0.10031669
Median Absolute Deviation (MAD): 0.05902225
Skewness: 0.33599867
Sum: 39397.877
Variance: 0.0050079731
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0.2085708732       82764  21.1%
0.1350686354       54845  14.0%
0.1361291069       52340  13.3%
0.06117271906      24854  6.3%
0.05032687619      20454  5.2%
0.04659431184      18922  4.8%
0.04242378015      17242  4.4%
0.04099915605      16443  4.2%
0.03920545835      15933  4.1%
0.03562790492      14474  3.7%
Other values (40)  74464  19.0%

Minimum 10 values

Value                Count  Frequency (%)
2.460490671 × 10^-6  3      < 0.1%
4.920981342 × 10^-6  2      < 0.1%
9.841962684 × 10^-6  4      < 0.1%
1.476294403 × 10^-5  6      < 0.1%
1.968392537 × 10^-5  8      < 0.1%
3.198637872 × 10^-5  13     < 0.1%
7.381472013 × 10^-5  30     < 0.1%
0.0001451689496      59     < 0.1%
0.0001919182723      78     < 0.1%
0.000201760235       82     < 0.1%

Maximum 10 values

Value          Count  Frequency (%)
0.2085708732   82764  21.1%
0.1361291069   52340  13.3%
0.1350686354   54845  14.0%
0.06117271906  24854  6.3%
0.05438176481  13750  3.5%
0.05032687619  20454  5.2%
0.04659431184  18922  4.8%
0.04242378015  17242  4.4%
0.04099915605  16443  4.2%
0.03920545835  15933  4.1%
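
The report does not say how FrequencyEncodedCreditor was built, but the name and the values suggest frequency encoding: each creditor label is replaced by its share of rows. The sketch below shows the technique on a hypothetical toy column; the function name is illustrative. Because such an encoding is a deterministic function of the creditor label, the high correlation between the two columns flagged in the alerts would be expected.

```python
from collections import Counter

def frequency_encode(values):
    """Map each category to its relative frequency within `values`."""
    counts = Counter(values)
    total = len(values)
    mapping = {category: count / total for category, count in counts.items()}
    return [mapping[v] for v in values], mapping

# Hypothetical toy column (labels mimic the redacted creditor names).
creditors = ["Creditor 1", "Creditor 2", "Creditor 1", "Creditor 2", "Creditor 1"]
encoded, mapping = frequency_encode(creditors)
print(mapping)
print(encoded)
```

One observation, offered tentatively: the smallest encoded value (about 2.46 × 10^-6) corresponds to one row in roughly 406,000, which is more than the 392735 rows profiled here, so the encoding may have been computed before some rows were removed.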

Interactions

2023-10-17T11:10:55.306630image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:57.028845image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:59.230895image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:01.398346image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:02.821705image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:04.552246image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:06.855097image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:08.504001image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:10.135903image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:11.891086image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:55.505745image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:57.187735image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:59.446748image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:01.596537image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:02.962169image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:04.738196image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:07.028075image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:08.665994image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:10.295594image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:12.083824image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:55.674722image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:57.365401image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:10:59.664795image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:01.775672image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:03.106244image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:04.942553image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:07.210383image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:08.833819image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/
2023-10-17T11:11:10.463151image/svg+xmlMatplotlib v3.5.2, https://matplotlib.org/

Correlations

Variable | CurrentBalance | DebtLoadPrincipal | BalanceAtDebtLoad | PurchasePrice | LastPaymentAmount | CustomerAge | NumPhones | NumEmails | NumAddresses | FrequencyEncodedCreditor | OriginalCreditor_Redacted | ProductOrDebtType | CollectionStatus | IsStatBarred | InBankruptcy | IsLegal | NumLiableParties
CurrentBalance | 1.000 | 0.679 | 0.672 | 0.169 | -0.273 | -0.088 | -0.290 | -0.002 | -0.206 | -0.236 | 0.046 | 0.036 | 0.017 | 0.004 | 0.041 | 0.034 | 0.056
DebtLoadPrincipal | 0.679 | 1.000 | 0.999 | 0.379 | 0.111 | -0.084 | 0.102 | 0.246 | -0.103 | -0.461 | 0.051 | 0.037 | 0.007 | 0.000 | 0.018 | 0.000 | 0.048
BalanceAtDebtLoad | 0.672 | 0.999 | 1.000 | 0.386 | 0.114 | -0.082 | 0.108 | 0.243 | -0.103 | -0.472 | 0.050 | 0.037 | 0.007 | 0.000 | 0.018 | 0.000 | 0.048
PurchasePrice | 0.169 | 0.379 | 0.386 | 1.000 | 0.050 | -0.116 | 0.207 | 0.283 | -0.048 | -0.659 | 0.841 | 0.329 | 0.178 | 0.205 | 0.048 | 0.229 | 0.037
LastPaymentAmount | -0.273 | 0.111 | 0.114 | 0.050 | 1.000 | 0.016 | 0.153 | 0.094 | 0.063 | -0.047 | 0.046 | 0.040 | 0.028 | 0.025 | 0.010 | 0.077 | 0.023
CustomerAge | -0.088 | -0.084 | -0.082 | -0.116 | 0.016 | 1.000 | -0.050 | -0.196 | 0.062 | 0.117 | 0.131 | 0.078 | 0.068 | 0.121 | 0.016 | 0.030 | 0.015
NumPhones | -0.290 | 0.102 | 0.108 | 0.207 | 0.153 | -0.050 | 1.000 | 0.376 | 0.150 | -0.276 | 0.159 | 0.125 | 0.206 | 0.564 | 0.106 | 0.100 | 0.121
NumEmails | -0.002 | 0.246 | 0.243 | 0.283 | 0.094 | -0.196 | 0.376 | 1.000 | 0.048 | -0.291 | 0.234 | 0.209 | 0.168 | 0.413 | 0.043 | 0.072 | 0.128
NumAddresses | -0.206 | -0.103 | -0.103 | -0.048 | 0.063 | 0.062 | 0.150 | 0.048 | 1.000 | 0.102 | 0.153 | 0.097 | 0.126 | 0.163 | 0.092 | 0.039 | 0.224
FrequencyEncodedCreditor | -0.236 | -0.461 | -0.472 | -0.659 | -0.047 | 0.117 | -0.276 | -0.291 | 0.102 | 1.000 | 1.000 | 0.530 | 0.308 | 0.253 | 0.056 | 0.168 | 0.055
OriginalCreditor_Redacted | 0.046 | 0.051 | 0.050 | 0.841 | 0.046 | 0.131 | 0.159 | 0.234 | 0.153 | 1.000 | 1.000 | 0.789 | 0.274 | 0.473 | 0.106 | 0.350 | 0.204
ProductOrDebtType | 0.036 | 0.037 | 0.037 | 0.329 | 0.040 | 0.078 | 0.125 | 0.209 | 0.097 | 0.530 | 0.789 | 1.000 | 0.175 | 0.481 | 0.068 | 0.193 | 0.073
CollectionStatus | 0.017 | 0.007 | 0.007 | 0.178 | 0.028 | 0.068 | 0.206 | 0.168 | 0.126 | 0.308 | 0.274 | 0.175 | 1.000 | 0.794 | 0.478 | 0.537 | 0.049
IsStatBarred | 0.004 | 0.000 | 0.000 | 0.205 | 0.025 | 0.121 | 0.564 | 0.413 | 0.163 | 0.253 | 0.473 | 0.481 | 0.794 | 1.000 | 0.104 | 0.080 | 0.049
InBankruptcy | 0.041 | 0.018 | 0.018 | 0.048 | 0.010 | 0.016 | 0.106 | 0.043 | 0.092 | 0.056 | 0.106 | 0.068 | 0.478 | 0.104 | 1.000 | 0.019 | 0.007
IsLegal | 0.034 | 0.000 | 0.000 | 0.229 | 0.077 | 0.030 | 0.100 | 0.072 | 0.039 | 0.168 | 0.350 | 0.193 | 0.537 | 0.080 | 0.019 | 1.000 | 0.046
NumLiableParties | 0.056 | 0.048 | 0.048 | 0.037 | 0.023 | 0.015 | 0.121 | 0.128 | 0.224 | 0.055 | 0.204 | 0.073 | 0.049 | 0.049 | 0.007 | 0.046 | 1.000
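
To make the matrix easier to scan, the following minimal Python sketch lists the variable pairs whose absolute coefficient reaches a cutoff. The pair values are copied from a few cells of the matrix above. The function name strong_pairs and the 0.5 cutoff are assumptions made for illustration; this section does not state which threshold the report itself uses for its "High correlation" alerts.

```python
# A hand-picked subset of cells from the correlation matrix above.
corr = {
    ("CurrentBalance", "DebtLoadPrincipal"): 0.679,
    ("CurrentBalance", "BalanceAtDebtLoad"): 0.672,
    ("DebtLoadPrincipal", "BalanceAtDebtLoad"): 0.999,
    ("PurchasePrice", "FrequencyEncodedCreditor"): -0.659,
    ("NumPhones", "IsStatBarred"): 0.564,
    ("CollectionStatus", "IsLegal"): 0.537,
    ("NumPhones", "NumEmails"): 0.376,
    ("CustomerAge", "NumEmails"): -0.196,
}


def strong_pairs(pairs, threshold=0.5):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first.

    The 0.5 default is an illustrative assumption, not a documented setting.
    """
    hits = [(a, b, r) for (a, b), r in pairs.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[2]), reverse=True)


for a, b, r in strong_pairs(corr):
    print(f"{a} ~ {b}: {r:+.3f}")
```

Sorting by absolute value keeps strong negative relationships, such as PurchasePrice against FrequencyEncodedCreditor, next to the strong positive ones.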

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
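
As a rough illustration of what a nullity correlation is, the sketch below converts two columns into missing/present indicators and computes the Pearson correlation between those indicators. The helper name nullity_correlation is hypothetical, and whether the report computes its heatmap in exactly this way is an assumption. The input values are the LastPaymentAmount and CustomerAge entries of the first ten rows in the Sample section, with NaN written as None; ten rows are far too few to say anything about the full dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    None marks a missing value. Returns NaN when either column is
    entirely missing or entirely present (zero variance).
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# First ten sample rows only (NaN transcribed as None).
last_payment = [10.00, None, 5.37, None, 5.00, 10.00, 91.27, 1200.00, 20.00, 8.26]
customer_age = [53.0, None, None, None, 46.0, None, 50.0, None, None, 46.0]

r = nullity_correlation(last_payment, customer_age)
print(f"nullity correlation: {r:.3f}")
```

A value near 1 would mean the two columns tend to be missing in the same rows, a value near -1 that one tends to be present when the other is missing, and a value near 0 that their missingness is unrelated.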

Sample

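First rows

Index | OriginalCreditor_Redacted | CurrentBalance | DebtLoadPrincipal | BalanceAtDebtLoad | PurchasePrice | ProductOrDebtType | CollectionStatus | IsStatBarred | InBankruptcy | IsLegal | LastPaymentAmount | NumLiableParties | CustomerAge | NumPhones | NumEmails | NumAddresses | FrequencyEncodedCreditor
0 | Creditor 1 | 0.00 | 1160.20 | 1160.20 | 4.22 | Other | PAID_IN_FULL | N | N | Y | 10.00 | 1.0 | 53.0 | 0 | 0 | 1 | 0.000020
1 | Creditor 2 | 182.90 | 182.90 | 182.90 | 4.22 | Other | CANCELLED_WITHDRAWN | Y | N | N | NaN | 1.0 | NaN | 0 | 0 | 1 | 0.000015
2 | Creditor 1 | 0.00 | 538.57 | 538.57 | 4.22 | Other | PAID_IN_FULL | N | N | N | 5.37 | 1.0 | NaN | 1 | 0 | 1 | 0.000020
3 | Creditor 2 | 8279.50 | 8279.50 | 8279.50 | 4.22 | Other | PASSIVE | Y | N | N | NaN | 1.0 | NaN | 1 | 0 | 1 | 0.000015
4 | Creditor 1 | 0.00 | 523.00 | 523.00 | 4.22 | Other | PAID_IN_FULL | Y | N | Y | 5.00 | 1.0 | 46.0 | 2 | 0 | 1 | 0.000020
5 | Creditor 1 | 1118.74 | 790.30 | 790.30 | 4.22 | Other | PASSIVE | Y | N | Y | 10.00 | 1.0 | NaN | 0 | 0 | 1 | 0.000020
6 | Creditor 1 | 0.00 | 71.89 | 71.89 | 4.22 | Other | PAID_IN_FULL | N | N | Y | 91.27 | 1.0 | 50.0 | 2 | 0 | 1 | 0.000020
7 | Creditor 2 | 0.00 | 11091.35 | 11091.35 | 4.22 | Other | PAID_IN_FULL | N | N | Y | 1200.00 | 1.0 | NaN | 1 | 0 | 1 | 0.000015
8 | Creditor 1 | 481.34 | 404.67 | 404.67 | 4.22 | Other | CLOSED | Y | N | N | 20.00 | 1.0 | NaN | 1 | 0 | 0 | 0.000020
9 | Creditor 1 | 0.00 | 903.76 | 903.76 | 4.22 | Other | PAID_IN_FULL | N | N | Y | 8.26 | 1.0 | 46.0 | 1 | 0 | 1 | 0.000020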

Last rows

Index | OriginalCreditor_Redacted | CurrentBalance | DebtLoadPrincipal | BalanceAtDebtLoad | PurchasePrice | ProductOrDebtType | CollectionStatus | IsStatBarred | InBankruptcy | IsLegal | LastPaymentAmount | NumLiableParties | CustomerAge | NumPhones | NumEmails | NumAddresses | FrequencyEncodedCreditor
406413 | Creditor 50 | 595.76 | 595.76 | 595.76 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 49.0 | 1 | 1 | 1 | 0.032577
406414 | Creditor 50 | 1749.52 | 1749.52 | 1749.52 | 7.38 | Finance Company - Other | UNDER_ARRANGEMENT | N | N | N | NaN | 1.0 | 40.0 | 2 | 1 | 1 | 0.032577
406415 | Creditor 50 | 4221.23 | 4221.23 | 4221.23 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 59.0 | 1 | 1 | 1 | 0.032577
406416 | Creditor 50 | 1386.81 | 1386.81 | 1386.81 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 38.0 | 2 | 1 | 1 | 0.032577
406417 | Creditor 50 | 8246.72 | 8246.72 | 8246.72 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 34.0 | 1 | 1 | 1 | 0.032577
406418 | Creditor 50 | 448.20 | 448.20 | 448.20 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 36.0 | 1 | 1 | 1 | 0.032577
406419 | Creditor 50 | 1678.37 | 1678.37 | 1678.37 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 37.0 | 0 | 1 | 1 | 0.032577
406420 | Creditor 50 | 3512.60 | 3512.60 | 3512.60 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 71.0 | 1 | 1 | 1 | 0.032577
406421 | Creditor 50 | 4477.31 | 4477.31 | 4477.31 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 38.0 | 1 | 1 | 1 | 0.032577
406422 | Creditor 50 | 272.59 | 272.59 | 272.59 | 7.38 | Finance Company - Other | ACTIVE | N | N | N | NaN | 1.0 | 30.0 | 0 | 1 | 1 | 0.032577